The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about common practice or about the bottlenecks the community faces in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, and algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive for participation (70%), while the prospect of prize money played only a minor role (16%). Although a median of 80 working hours was spent on method development, a large portion of participants (32%) stated that they did not have enough time for it, and 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based either on multiple identical models (61%) or on heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
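To make the patch-based strategy mentioned in this abstract concrete, the following is a minimal sketch of how a sample that is too large to process at once might be tiled into fixed-size patches. It is not taken from the survey or from any surveyed solution; the function names `patch_starts` and `patch_grid`, the volume shape, and the patch and stride sizes are all hypothetical choices for illustration.

```python
from itertools import product
from typing import List, Sequence, Tuple


def patch_starts(length: int, patch: int, stride: int) -> List[int]:
    """Start offsets of patches of size `patch` that cover an axis of size `length`."""
    if patch <= 0 or stride <= 0:
        raise ValueError("patch and stride must be positive")
    if patch > length:
        raise ValueError("patch must not exceed the axis length")
    starts = list(range(0, length - patch + 1, stride))
    # Add a final patch flush with the border if the regular grid stops short.
    if starts[-1] != length - patch:
        starts.append(length - patch)
    return starts


def patch_grid(
    shape: Sequence[int], patch: Sequence[int], stride: Sequence[int]
) -> List[Tuple[int, ...]]:
    """Corner coordinates of all patches tiling an N-dimensional sample."""
    per_axis = [patch_starts(n, p, s) for n, p, s in zip(shape, patch, stride)]
    return list(product(*per_axis))


# Hypothetical 3D volume of 64 x 200 x 200 voxels, tiled with 32 x 96 x 96 patches.
corners = patch_grid(shape=(64, 200, 200), patch=(32, 96, 96), stride=(32, 96, 96))
print(len(corners), corners[:3])
```

Each corner, together with the patch size, identifies one sub-volume that can be cropped and fed to a model independently. Placing the last patch flush with the border (rather than padding) is one possible design choice; it makes the final patches overlap their neighbours.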
In this paper, we revisit endless online level generation with the recently proposed experience-driven procedural content generation via reinforcement learning (EDRL) framework, starting from the observation that EDRL tends to generate recurrent patterns. Inspired by this phenomenon, we formulate a notion of state space closure, which means that any state that may appear in an infinite-horizon online generation process can be found within a finite horizon. Through theoretical analysis, we find that although state space closure raises a concern about diversity, it allows EDRL trained on a finite horizon to generalise to the infinite-horizon scenario without deterioration of content quality. Moreover, we verify the quality and diversity of content generated by EDRL via empirical studies on the widely used Super Mario Bros. benchmark. Experimental results reveal that the current EDRL approach's ability to generate diverse game levels is limited due to state space closure, whereas it does not suffer from reward deterioration given a horizon longer than the one used in training. Summarising our findings and analysis, we argue that future work on generating diverse and high-quality content online via EDRL should address the issue of diversity on the premise of state space closure, which ensures quality.
Adversarial attacks can easily fool object recognition systems based on deep neural networks (DNNs). Although many defense methods have been proposed in recent years, most of them can still be adaptively evaded. One reason for the weak adversarial robustness may be that DNNs are supervised only by category labels and lack the part-based inductive bias of the human recognition process. Inspired by recognition-by-components, a well-known theory in cognitive psychology, we propose a novel object recognition model, ROCK (Recognizing Object by Components with human prior Knowledge). It first segments parts of objects from images, then scores the part segmentation results with predefined human prior knowledge, and finally outputs a prediction based on the scores. The first stage of ROCK corresponds to the process of decomposing objects into parts in human vision. The second stage corresponds to the decision process of the human brain. ROCK shows better robustness than classical recognition models across various attack settings. These results encourage researchers to rethink the rationality of currently widely used DNN-based object recognition models and to explore the potential of part-based models, once important but recently ignored, for improving robustness.
Neural networks are susceptible to data inference attacks such as the membership inference attack, the adversarial model inversion attack, and the attribute inference attack, in which the attacker infers useful information such as the membership, the reconstruction, or the sensitive attributes of a data sample from the confidence scores predicted by the target classifier. In this paper, we propose a method, namely PURIFIER, to defend against membership inference attacks. It transforms the confidence score vectors predicted by the target classifier and makes the purified confidence scores of members and non-members indistinguishable in individual shape, statistical distribution, and prediction label. The experimental results show that PURIFIER defends against membership inference attacks with high effectiveness and efficiency, outperforming previous defense methods while incurring negligible utility loss. Our further experiments show that PURIFIER is also effective in defending against adversarial model inversion attacks and attribute inference attacks. For example, the inversion error is raised by a factor of about 4 or more on the Facescrub530 classifier, and the attribute inference accuracy drops significantly when PURIFIER is deployed in our experiment.
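To illustrate why confidence scores can leak membership, here is a minimal sketch of a simple confidence-threshold membership inference baseline. It is neither PURIFIER nor any attack evaluated in the paper; the function names, the threshold, and the confidence vectors are made up for illustration. The attack guesses "member" whenever the classifier's top confidence reaches a threshold, relying on the assumption that models tend to be more confident on their training data.

```python
from typing import List, Sequence


def threshold_attack(
    confidences: Sequence[Sequence[float]], threshold: float
) -> List[bool]:
    """Guess 'member' whenever the top confidence score reaches the threshold."""
    return [max(vector) >= threshold for vector in confidences]


def attack_accuracy(guesses: Sequence[bool], is_member: Sequence[bool]) -> float:
    """Fraction of samples whose membership status was guessed correctly."""
    correct = sum(g == m for g, m in zip(guesses, is_member))
    return correct / len(is_member)


# Made-up confidence vectors: the first two samples play the role of members.
raw_scores = [
    [0.98, 0.01, 0.01],
    [0.95, 0.03, 0.02],
    [0.60, 0.30, 0.10],
    [0.55, 0.25, 0.20],
]
is_member = [True, True, False, False]

# Made-up "defended" scores that look the same for members and non-members.
uniform_scores = [[0.70, 0.20, 0.10]] * 4

raw_acc = attack_accuracy(threshold_attack(raw_scores, 0.9), is_member)
uniform_acc = attack_accuracy(threshold_attack(uniform_scores, 0.9), is_member)
print(raw_acc, uniform_acc)
```

In this toy setting the attack separates the raw scores perfectly, but falls to chance level once members and non-members produce indistinguishable confidence vectors, which is the property the abstract says the defense aims for.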
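Vision-based tactile sensors that can measure the 3D geometry of contacting objects are crucial for robots to perform dexterous manipulation tasks. However, existing sensors are usually complicated to fabricate and delicate to extend. In this work, we take advantage of the reflection property of a semitransparent elastomer in a novel way to design a robust, low-cost, and easy-to-fabricate tactile sensor named DTACT. DTACT accurately measures high-resolution 3D geometry from the darkness shown in the captured tactile images, with only a single image needed for calibration. In contrast to previous sensors, DTACT is robust under various illumination conditions. We then build DTACT prototypes with non-planar contact surfaces at minimal extra effort and cost. Finally, we perform two intelligent robotic tasks using DTACT, pose estimation and object recognition, in which DTACT shows great potential for applications.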
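The knowledge graph (KG) has been a prominent form of knowledge representation in recent years. Because it concentrates on nominal entities and their relationships, the traditional knowledge graph is essentially static and encyclopedic. On this basis, the event knowledge graph (Event KG) models temporal and spatial dynamics through text processing to facilitate downstream applications such as question answering, recommendation, and intelligent search. Existing KG research, on the other hand, mostly focuses on text processing and static facts, ignoring the large amount of dynamic behavioral information contained in photos, movies, and pre-trained neural networks. In addition, no effort has been made to incorporate behavioral intelligence information into knowledge graphs for deep reinforcement learning (DRL) and robot learning. In this paper, we propose a novel dynamic knowledge and skill graph (KSG), and we then develop a basic and specific KSG based on CN-DBPEDIA. The nodes are divided into entity nodes and attribute nodes: entity nodes contain the agent, environment, and skill (a DRL policy or policy representation), and attribute nodes contain the entity description, pre-trained network, and offline dataset. KSG can search for the skills of different agents in various environments and provide transferable information for acquiring new skills. To our knowledge, this is the first study to investigate a dynamic KSG for skill retrieval and learning. Extensive experimental results on new skill learning show that KSG improves the efficiency of learning new skills.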
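Glass is very common in the real world. Owing to the uncertainty of the glass region and the varied, complex scenes behind the glass, the presence of glass poses severe challenges to many computer vision tasks, making glass segmentation an important computer vision task. Glass has no visual appearance of its own; it only transmits or reflects the appearance of its surroundings, which makes it fundamentally different from other common objects. To address this challenging task, existing methods typically explore and combine useful cues from different feature levels in a deep network. Because a gap exists between features of different levels (deep features embed more high-level semantics and are better at locating target objects, whereas shallow features have larger spatial sizes and retain richer, more detailed low-level information), fusing these features naively leads to a sub-optimal solution. In this paper, we approach effective feature fusion in two steps towards precise glass segmentation. First, we attempt to bridge the gap between features of different levels by developing a Discriminability Enhancement (DE) module, which turns level-specific features into more discriminative representations and thus alleviates the incompatibility of the features to be fused. Second, we design a Focus-and-Exploration Based Fusion (FEBF) module that richly mines useful information during fusion by highlighting the commonalities and exploring the differences between level-different features.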
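Automatically tagging practice problems with knowledge points is the basis for managing question banks and for making education more automated and intelligent. Research on automatic tagging techniques for practice problems is therefore of great practical significance. However, there is little research on the automatic tagging of knowledge points for math problems. Compared with general text, math text has more complex structure and semantics because it contains unique elements such as symbols and formulas. It is therefore difficult to meet the accuracy requirements of knowledge point prediction by directly applying text classification techniques from the general domain. In this paper, K12 math problems are taken as the research object, and a LABS model based on label-semantic attention and multi-label smoothing combined with textual features is proposed to improve the automatic tagging of knowledge points for math problems. The model combines text classification techniques from the general domain with the unique features of math text. The results show that models using label-semantic attention or multi-label smoothing outperform the traditional BiLSTM model on precision, recall, and F1-score, and that the LABS model, which uses both, performs best. Label information can thus guide the neural network to extract meaningful information from the problem text, improving the model's text classification performance. Moreover, multi-label smoothing combined with textual features can fully explore the relationship between text and labels, improving the model's ability to predict on new data and its classification accuracy.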
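As background for the smoothing idea mentioned in this abstract, the sketch below shows plain label smoothing applied to a multi-hot target vector. It is not the paper's multi-label smoothing combined with textual features, whose details the abstract does not give. It assumes that each label is treated as an independent binary target and smoothed towards 0.5; the function name `smooth_multi_hot` and the value of `epsilon` are hypothetical.

```python
from typing import List, Sequence


def smooth_multi_hot(targets: Sequence[int], epsilon: float = 0.1) -> List[float]:
    """Soften 0/1 multi-label targets: 1 -> 1 - epsilon/2, 0 -> epsilon/2."""
    if not 0.0 <= epsilon < 1.0:
        raise ValueError("epsilon must lie in [0, 1)")
    return [t * (1.0 - epsilon) + 0.5 * epsilon for t in targets]


# Hypothetical problem tagged with the 1st and 3rd of four knowledge points.
hard_targets = [1, 0, 1, 0]
soft_targets = smooth_multi_hot(hard_targets, epsilon=0.1)
print(soft_targets)
```

Training against softened targets instead of hard 0/1 values is commonly used to discourage over-confident predictions. A variant that uses text features would replace the constant offset with label-specific values.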
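Many studies have been devoted to the problem of learning fair representations. However, they do not explicitly model the relationships between latent representations. In many real-world applications, there may be causal relationships between latent representations. Furthermore, most fair representation learning methods focus on group-level fairness and are based on correlations, ignoring the causal relationships underlying the data. In this work, we theoretically demonstrate that using structured representations enables downstream predictive models to achieve counterfactual fairness, and we then propose the Counterfactual Fairness Variational AutoEncoder (CF-VAE) to obtain structured representations with respect to domain knowledge. Experimental results show that the proposed method achieves better fairness and accuracy than benchmark fairness methods.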
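A game consists of multiple types of content, and the harmony of different content types plays an essential role in game design. However, most works on procedural content generation consider only one type of content at a time. In this paper, we propose and formulate online level generation from music, which matches level features to music features in real time while adapting to the player's play speed. A generic framework named online player-adaptive procedural content generation via reinforcement learning, OPARL for short, is built upon experience-driven reinforcement learning and controllable reinforcement learning to enable online level generation from music. Furthermore, a novel control policy based on local search and k-nearest neighbours is proposed and integrated into OPARL to control the level generator using play data collected online. Results of simulation-based experiments show that our implementation of OPARL is capable of generating playable levels that follow the ``energy'' dynamic of the music in an online fashion.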